Abstract. Syndrome coding was proposed by Crandall in 1998 as a
method to stealthily embed a message in a cover-medium through the
use of bounded decoding. In 2005, Fridrich et al. introduced wet
paper codes to improve the undetectability of the embedding by enabling
the sender to lock some components of the cover-data, according to the
nature of the cover-medium and the message. Unfortunately, almost all
existing methods solving the bounded decoding syndrome problem, with
or without locked components, have a non-zero probability of failure.
In this paper, we introduce a randomized syndrome coding scheme, which
guarantees embedding success with probability one. We analyze the
parameters of this new scheme in the case of perfect codes.

1 Introduction

Steganography is the art of stealth communication: messages are hidden
in innocuous-looking cover-media. The sender and receiver may proceed
by cover selection, cover synthesis, or cover modification to exchange
messages. Here, we focus on the cover modification scenario, where the
sender chooses some cover-medium in her library, and modifies it to
carry the message she wants to send. Once the cover-medium is chosen,
the sender extracts some of its components to construct a cover-data
vector. Then, she modifies it to embed the message. This modified
vector, called the stego-data, leads back to the stego-medium that is
communicated to the recipient. In the case of digital images, the
insertion may for example consist in modifying some of the image's
components, e.g. the luminance of the pixels or the values of some
transform (DCT or wavelet) coefficients. For a given transmitted
document, only the sender and receiver should be able to tell whether
it carries a hidden message or not [33]. This means that the
stego-media, which carry the messages, have to be statistically
indistinguishable from original media [6,7]. But the statistical
detectability of most steganographic schemes increases with embedding
distortion [21], which is often measured by the number of embedding
changes. Hence it is important for the sender to embed the message
while modifying as few components of the cover-data as possible.

In 1998, Crandall proposed to model the embedding and extraction
process with the use of linear error correcting codes. He proposed to
use Hamming codes, which are covering codes [9]. The key idea of this
approach, called syndrome coding, or matrix embedding, is to modify the
cover-data to obtain a stego-data lying in the right coset of the code,
its syndrome being precisely equal to the message to hide. Later on, it
was shown that designing steganographic schemes is precisely equivalent
to designing covering codes [3,22,23], meaning that this covering codes
approach is not restrictive. Moreover, it has been shown to be really
helpful and efficient to minimize the embedding distortion [3,22,23,4].
It has also been made popular by its use in the famous steganographic
algorithm F5 [36]. For all these reasons, this approach is of interest.

The process which states which components of the cover-data can
actually be modified is called the selection channel [1]. Since the
message embedding should introduce as little distortion as possible,
the selection channel is of utmost importance. The selection channel
may be arbitrary, but a more efficient approach is to select it
dynamically during the embedding step, according to the cover-medium
and the message. This leads to a better undetectability, and makes
attacks on the system harder to run, but in this context the extraction
of the hidden message is more difficult, as the selection channel is
only known to the sender, and not to the recipient. Wet Paper Codes
were introduced to tackle this non-shared selection channel, through
the notions of dry and wet components [18]. The analogy is with a sheet
of paper that has been exposed to rain: we can still write easily on
dry spots, whereas we cannot write on wet spots. The idea is,
adaptively to the message and the cover-medium, to lock some components
of the cover-data (the wet components) to prevent them from being
modified. The other components of the cover-data (the dry components)
remain free to be modified to embed the message.

Algorithmically speaking, syndrome coding provides the recipient with
an easy way to access the message, through a simple syndrome
computation. But to embed the message, the sender has to tackle a
harder challenge, linked with bounded syndrome decoding. It has been
shown that, although random codes may seem interesting for their
asymptotic behavior, their use requires solving really hard problems:
syndrome decoding and covering radius computation, which are proved to
be NP-complete and $\Pi_2$-complete respectively [34,25]. Moreover, no
efficient decoding algorithm is known for generic, or random, codes.
Hence, attention has been given to structured codes to design Wet Paper
Codes: Hamming codes [9I2T], Simplex codes [20], BCH codes
[31I32I37I3UI2T], Reed-Solomon codes [14,15], perfect product codes
[29,28], low density generator matrix codes [17,39,38,10], and
convolutional codes [13,11,12].

The efficiency of embedding techniques is usually evaluated through
their relative payload (number of message symbols per (modifiable)
cover-data symbol) and average embedding efficiency (average number of
message symbols per cover-data modification). Today, we can find in the
literature quasi-optimal codes in terms of average embedding efficiency
and payload [U 39 38 1 6110]. Nevertheless, we are interested here in
another criterion, which is usually not discussed: the probability for
the embedding to fail. In fact, the only case for which it never fails
is when using perfect codes (a), without locking any component of the
cover-data (b). But very few codes are perfect (namely the Hamming and
Golay codes), and their average embedding efficiency is quite low.
Moreover, it is really important in practice to be able to lock some
components of the cover-data. Hence, efficient practical schemes
usually violate at least one of conditions (a) and (b), leading to a
non-zero probability for the embedding to fail. This probability
increases with the number of locked components. More precisely,
syndrome coding usually divides the whole message into fragments, which
are separately inserted in different cover-data vectors (coming from
one or several cover-media). Inserting each fragment involves finding a
low weight solution of a linear system, which may not always have a
solution for a given set of locked components. Consequently, the
probability that the whole message can be embedded decreases
exponentially with the number of fragments to hide and with the number
of locked components [21].

Hence, we have to decide what to do when embedding fails. In the
common scenario where the sender has to choose a cover-medium in a
huge collection of documents, she can drop the cover-medium that leads
to a failure and choose another one, iterating the process until she
finds a cover-medium that is adequate to embed the message. Another
solution may be to cut the message into smaller pieces, in order to
have shorter messages to embed, and a lower probability of failure. If
none of these is possible, for example if the sender has only a few
pieces of content, she may unlock some locked components [13] to make
the probability of failure decrease. But, even with this modified
embedding and its lower probability of failure, the sender will not be
able to bring that probability down to zero, except if she falls back
to perfect codes without locked components.

In this paper, we consider the "worst case" scenario, where the sender
does not have many cover documents to hide her message in, and
therefore absolutely needs embedding to succeed. This scenario is not
the most studied one, and concerns very constrained situations. Our
contribution is to propose an embedding scheme that never fails, and
does not relax the management of locked components of the cover-data to
make embedding succeed. It is, to our knowledge, the first bounded
syndrome coding scheme that manages locked components while
guaranteeing the complete embedding of the message for any code, be it
perfect or not. To do so, we modify the classical syndrome coding
approach by using some part of the syndrome for randomization. Of
course, as the message we can embed is now shorter than the syndrome,
there is a loss in terms of embedding efficiency. We analyze this loss
in the case of linear perfect codes. Moreover, inspired by the ZZW
construction [39], we show how the size of the random part of the
syndrome, which is dynamically estimated during embedding, can be
transmitted to the recipient without any additional communication.

The paper is organized as follows. Basic definitions and notation on
both steganography and syndrome coding are introduced in Section 2.
The traditional syndrome coding approach is recalled at the end of this
section. In Section 3, we show how to slightly relax the constraints on
the linear system to make it always solvable, and also estimate the
loss of embedding efficiency. We discuss the behavior of our scheme in
the case of the Golay and Hamming perfect codes in Section 4. Finally,
as our solution uses a parameter $r$ that is dynamically computed
during embedding, we provide in Section 5 a construction that makes it
possible to transmit $r$ to the recipient through the stego-data
itself, that is, without any parallel or side-channel communication. We
conclude in Section 6.

2 Steganography and coding theory 
2.1 Steganographic schemes 

We define a stego-system (or a steganographic scheme) by a pair of
functions, Emb and Ext. Emb embeds the message $m$ in the cover-data
$x$, producing the stego-data $y$, while Ext extracts the message $m$
from the stego-data $y$. To make the embedding and extraction work
properly, these functions have to satisfy the following properties.



Definition 1 (Stego-System). Let $A$ be a finite alphabet, $r, n \in
\mathbb{N}$ such that $r < n$, $x \in A^n$ denote the cover-data,
$m \in A^r$ denote the message to embed, and $T$ be a strictly positive
integer. A stego-system is defined by a pair of functions Ext and Emb
such that:

$$\mathrm{Ext}(\mathrm{Emb}(x, m)) = m \qquad (1)$$
$$d(x, \mathrm{Emb}(x, m)) \le T \qquad (2)$$

where $d(\cdot, \cdot)$ denotes the Hamming distance over $A^n$.

Two quantities are usually used to compare stego-systems: the embedding 
efficiency and the relative payload, which are defined as follows. 

Definition 2 (Embedding efficiency). The average embedding efficiency
of a stego-system is usually defined as the ratio of the number of
message symbols we can embed to the average number of symbols changed.
We denote it by $e$.

Definition 3 (Relative payload). The relative payload of a
stego-system, denoted by $\alpha$, is the ratio of the number of
message symbols we can embed to the number of (modifiable) symbols of
cover-data.

For $q$-ary syndrome coding, the sphere-covering bound gives an upper
bound for the embedding efficiency [16]. Note that it is usually stated
for the binary case, using the binary entropy function.

Proposition 1 (Sphere-covering bound). For any $q$-ary stego-system
$S$, the sphere-covering bound gives

$$e \le \frac{\alpha}{H_q^{-1}(\alpha)},$$

where $H_q^{-1}(\cdot)$ denotes the inverse function of the $q$-ary
entropy
$H_q(x) = x\log_q(q-1) - x\log_q(x) - (1-x)\log_q(1-x)$ on
$[0, 1-1/q]$, and $\alpha$ is the relative payload associated with $S$.
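
To make the bound of Proposition 1 concrete, the following Python
sketch evaluates it numerically. It is only an illustration: the
function names are ours, and the inverse entropy is obtained by plain
bisection on $[0, 1-1/q]$, where $H_q$ is increasing.

```python
import math

def q_ary_entropy(x: float, q: int) -> float:
    """H_q(x) = x log_q(q-1) - x log_q(x) - (1-x) log_q(1-x), with H_q(0) = 0."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return math.log(q - 1, q)
    return (x * math.log(q - 1, q)
            - x * math.log(x, q)
            - (1 - x) * math.log(1 - x, q))

def inverse_q_ary_entropy(alpha: float, q: int, tol: float = 1e-12) -> float:
    """Inverse of H_q on [0, 1 - 1/q], found by bisection."""
    lo, hi = 0.0, 1.0 - 1.0 / q
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if q_ary_entropy(mid, q) < alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def efficiency_upper_bound(alpha: float, q: int = 2) -> float:
    """Sphere-covering upper bound alpha / H_q^{-1}(alpha), for 0 < alpha <= 1."""
    return alpha / inverse_q_ary_entropy(alpha, q)
```

For instance, `efficiency_upper_bound(3 / 7, 2)` gives the bound at the
relative payload of the binary [7, 4] Hamming code, which can be
compared with the efficiency actually reached by that code.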

2.2 From coding theory to steganography 

This section recalls how coding theory may help embedding the message,
and how it tackles the non-shared selection channel paradigm. In the
rest of the paper, the finite alphabet $A$ is a finite field of
cardinality $q$, denoted $\mathbb{F}_q$.

Here we focus on the use of linear codes, which is the most studied
case. Let $C$ be an $[n, k, d]_q$-linear code, with parity check matrix
$H$ and covering radius $\rho$, that is, the smallest integer such that
the balls of radius $\rho$ centered on the codewords of $C$ cover the
whole ambient space $\mathbb{F}_q^n$. A syndrome coding scheme based on
$C$ modifies the cover-data $x$ in such a way that the syndrome $yH^t$
of the stego-data $y$ is precisely equal to the message $m$.
Determining which symbols of $x$ to modify amounts to finding a
solution of a particular linear system that involves the parity check
matrix $H$. This embedding approach was introduced by Crandall in
1998 [9], and is called syndrome coding or matrix embedding.

We formulate several embedding problems. The first one addresses only
the requirement of Eq. (1), whereas the second one also tackles Eq. (2).

Problem 1 (Syndrome coding problem). Let $C$ be an $[n, k, d]_q$ linear
code, $H$ be a parity check matrix of $C$, $x \in \mathbb{F}_q^n$ be a
cover-data, and $m \in \mathbb{F}_q^{n-k}$ be the message to be hidden
in $x$. The syndrome coding problem consists in finding
$y \in \mathbb{F}_q^n$ such that $yH^t = m$.

Problem 2 (Bounded syndrome coding problem). Let $C$ be an
$[n, k, d]_q$ linear code, $H$ be a parity check matrix of $C$,
$x \in \mathbb{F}_q^n$ be a cover-data, $m \in \mathbb{F}_q^{n-k}$ be
the message to be hidden in $x$, and $T \in \mathbb{N}^*$ be an upper
bound on the number of authorized modifications. The bounded syndrome
coding problem consists in finding $y \in \mathbb{F}_q^n$ such that
$yH^t = m$, and $d(x, y) \le T$.

Let us first focus on Problem 1, which leads to describing the
stego-system in terms of syndrome computation:

$$y = \mathrm{Emb}(x, m) = x + D(m - xH^t),$$
$$\mathrm{Ext}(y) = yH^t,$$

where $D$ is the mapping associating to a syndrome $m$ a vector whose
syndrome is precisely equal to $m$. The mapping $D$ is thus directly
linked to a decoding function $f_C$ of $C$ of arbitrary radius $T_f$,
defined as $f_C : \mathbb{F}_q^n \to C \cup \{?\}$, such that for all
$y \in \mathbb{F}_q^n$, either $f_C(y) = ?$, or
$d(y, f_C(y)) \le T_f$.

The Hamming distance between vectors $x$ and $y$ is then less than or
equal to $T_f$. Since decoding general codes is NP-hard [2], finding
such a mapping $D$ is not tractable if $C$ does not belong to a family
of codes we can efficiently decode. Moreover, to be sure that Problem 2
always has a solution, it is necessary and sufficient that $f_C$ can
decode up to the covering radius of $C$. This means that solving
Problem 2 with $T = \rho$ is precisely equivalent to designing a
stego-system which finds solutions to both Eqs. (1) and (2) for any $x$
and $m$. In this context, perfect codes, for which the covering radius
is precisely equal to the error-correcting capacity
($\rho = \lfloor \frac{d-1}{2} \rfloor$), are particularly relevant.
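
As an illustration of the Emb and Ext functions above, here is a
minimal Python sketch for binary Hamming codes, for which $\rho = 1$.
It assumes a particular parity check matrix (column $j$ is the binary
expansion of $j$), so that the mapping $D$ reduces to reading the
syndrome difference as a column index. The function names are ours.

```python
def hamming_parity_check(p: int) -> list[list[int]]:
    """p x (2^p - 1) matrix whose j-th column is the binary expansion of j."""
    n = (1 << p) - 1
    return [[(j >> i) & 1 for j in range(1, n + 1)] for i in range(p)]

def syndrome(y, H):
    """Compute y H^t over F_2."""
    return [sum(h * b for h, b in zip(row, y)) % 2 for row in H]

def embed(x, m, H):
    """Emb(x, m) = x + D(m - x H^t): at most one bit of x is flipped."""
    diff = [(a - b) % 2 for a, b in zip(m, syndrome(x, H))]
    # With this H, diff read as an integer is the 1-based index of the
    # column equal to diff (0 means that nothing has to change).
    pos = sum(bit << i for i, bit in enumerate(diff))
    y = list(x)
    if pos:
        y[pos - 1] ^= 1
    return y

def extract(y, H):
    """Ext(y) = y H^t."""
    return syndrome(y, H)
```

With `H = hamming_parity_check(3)`, any 3-bit message can be embedded
in any 7-bit cover vector by changing at most one bit, and `extract`
recovers it with a single syndrome computation.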

Unfortunately, using perfect codes leads to an embedding efficiency
which is far from the bound given in Prop. 1 [4]. Hence non-perfect
codes have been studied (see the Introduction), even if they can only
tackle Problem 2 for some $T$ much lower than $\rho$. This can force
the system to perform only a small number of modifications.

As discussed in the introduction, Wet paper codes were introduced to
improve embedding undetectability through the management of locked,
or wet, components [18].

Problem 3 (Bounded syndrome wet paper coding problem). Let $C$ be an
$[n, k, d]_q$ linear code, $H$ be a parity check matrix of $C$,
$x \in \mathbb{F}_q^n$, $m \in \mathbb{F}_q^{n-k}$,
$T \in \mathbb{N}^*$, and a set of locked, or wet, components
$\mathcal{I} \subset \{1, \dots, n\}$, $\ell = |\mathcal{I}|$. The
bounded syndrome wet paper coding problem consists in finding
$y \in \mathbb{F}_q^n$ such that $yH^t = m$, $d(x, y) \le T$, and
$x_i = y_i$ for all $i \in \mathcal{I}$.

Of course, solving Problem 3 is harder, and even perfect codes may fail
here. More precisely, to deal with locked components, we usually
decompose the parity check matrix $H$ of $C$ in the following way
[18,19]:

$$yH^t = m,$$
$$y|_{\bar{\mathcal{I}}} H|_{\bar{\mathcal{I}}}^t + y|_{\mathcal{I}} H|_{\mathcal{I}}^t = m,$$
$$y|_{\bar{\mathcal{I}}} H|_{\bar{\mathcal{I}}}^t = m - y|_{\mathcal{I}} H|_{\mathcal{I}}^t,$$

where $\bar{\mathcal{I}} = \{1, \dots, n\} \setminus \mathcal{I}$. The
previous equation can be solved for every message $m$ only if
$\mathrm{rank}(H|_{\bar{\mathcal{I}}}) = n - k$. Since the potential
structure of $H$ does not help to solve the previous problem, we could
as well choose $H$ to be a random matrix, which has the main advantage
of asymptotically maximizing the average embedding efficiency [22,19].
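
The linear-algebra step of this decomposition can be sketched in Python
for the binary case. This is a simplified illustration under stated
assumptions: $q = 2$, the bound $T$ on the number of changes is
ignored, and the unknowns are the changes applied to the dry positions
(equivalent to solving for $y$ on the dry positions, since
$y_i = x_i$ on the locked ones). The names `solve_gf2` and
`wet_paper_embed` are ours.

```python
def solve_gf2(A, b):
    """Return one solution z of A z = b over F_2, or None if there is none."""
    rows, cols = len(A), len(A[0])
    M = [A[i][:] + [b[i]] for i in range(rows)]  # augmented matrix
    pivots, r = [], 0
    for c in range(cols):
        pr = next((i for i in range(r, rows) if M[i][c]), None)
        if pr is None:
            continue
        M[r], M[pr] = M[pr], M[r]
        for i in range(rows):
            if i != r and M[i][c]:
                M[i] = [u ^ v for u, v in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
        if r == rows:
            break
    if any(M[i][cols] for i in range(r, rows)):
        return None  # zero row with a non-zero right-hand side
    z = [0] * cols
    for i, c in enumerate(pivots):
        z[c] = M[i][cols]  # free variables are left at 0
    return z

def wet_paper_embed(x, m, H, locked):
    """Find y with y H^t = m and y_i = x_i on locked (0-based) indices."""
    dry = [j for j in range(len(x)) if j not in locked]
    s = [sum(h * a for h, a in zip(row, x)) % 2 for row in H]
    rhs = [mi ^ si for mi, si in zip(m, s)]      # m - x H^t
    A = [[row[j] for j in dry] for row in H]     # H restricted to dry columns
    z = solve_gf2(A, rhs)
    if z is None:
        return None                              # embedding failure
    y = list(x)
    for j, zj in zip(dry, z):
        y[j] ^= zj
    return y
```

A return value of `None` corresponds to the failure case discussed
above, where the restricted matrix does not have full rank and the
message falls outside its image.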

Hiding a long message requires splitting it and repeatedly using the
basic scheme. Let $P_H$ be the success probability for embedding
$(n - k)$ symbols; then the global success probability $P$ for a long
message of length $L(n - k)$ is $P = P_H^L$. This probability decreases
exponentially with the message length.

In order to bypass this issue, previous works propose either to take
another cover-medium, or to modify some locked components. In this
paper, we keep the locked components unmodified, thus maintaining the
same level of undetectability. Moreover, we tackle the particular case
where the sender does not have a lot of cover-media available, and
needs a successful embedding, even if this leads to a smaller embedding
efficiency.

In the original Wet Paper setting of [18], the embedding efficiency is
not dealt with. In that case, we have a much easier problem.

Problem 4 (Unbounded wet paper syndrome coding problem). Let $C$ be an
$[n, k, d]_q$ linear code, $H$ be a parity check matrix of $C$,
$x \in \mathbb{F}_q^n$, $m \in \mathbb{F}_q^{n-k}$, and a set of locked
components $\mathcal{I} \subset \{1, \dots, n\}$,
$\ell = |\mathcal{I}|$. The unbounded wet paper syndrome coding problem
consists in finding $y \in \mathbb{F}_q^n$ such that $yH^t = m$, and
$x_i = y_i$ for all $i \in \mathcal{I}$.

In a random case setting, this problem can be discussed using a lower
bound on random matrices, provided by [5].

Theorem 1. Let $M$ be a random $ncol \times nrow$ matrix defined over
$\mathbb{F}_q$, such that $ncol \ge nrow$. We have:

$$P(\mathrm{rank}(M) = nrow) \ge \begin{cases} 0.288, & \text{if } ncol = nrow \text{ and } q = 2, \\ 1 - \dfrac{1}{q^{ncol-nrow}(q-1)}, & \text{otherwise.} \end{cases}$$
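
To give an order of magnitude, the following Python sketch compares the
lower bound of Theorem 1 with the exact full-rank probability of a
uniformly random matrix, $\prod_{i=0}^{nrow-1} (1 - q^{i-ncol})$ (a
standard counting result, not taken from this paper), and evaluates the
global success probability $P = P_H^L$ discussed earlier. The function
names are ours.

```python
def full_rank_probability(nrow: int, ncol: int, q: int) -> float:
    """Exact P(rank = nrow) for a uniform nrow x ncol matrix over F_q (ncol >= nrow)."""
    prob = 1.0
    for i in range(nrow):
        prob *= 1.0 - float(q) ** (i - ncol)
    return prob

def theorem1_lower_bound(nrow: int, ncol: int, q: int) -> float:
    """Lower bound of Theorem 1."""
    if ncol == nrow and q == 2:
        return 0.288
    return 1.0 - 1.0 / (q ** (ncol - nrow) * (q - 1))

def global_success(p_h: float, L: int) -> float:
    """P = P_H^L for a message split into L fragments."""
    return p_h ** L
```

Even with a per-fragment success probability of 0.99, a message split
into 100 fragments is fully embedded with probability about 0.37 only,
which illustrates the exponential decrease.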

In a worst-case, or infallible, setting, the relevant parameter of the
code is its dual distance.

Proposition 2. Consider a $q$-ary wet channel of length $n$ with at
most $\ell$ wet positions, and suppose that there exists a $q$-ary code
$C$ whose dual code has parameters $[n, k^\perp, d^\perp = \ell]_q$
with $k^\perp + d^\perp = n + 1 - g$. Then we can surely embed
$n - \ell - g$ symbols using a parity check matrix of $C$.

Proof. This can be derived from [26, Theorem 2.3].

This means that if the code is $g$ far from the Singleton bound, then
we lose $g$ information symbols with respect to the maximum. In
particular, if $n < q$, there exists a $q$-ary Reed-Solomon code with
$g = 0$, and we can always embed $n - \ell$ symbols when there are
$\ell$ wet symbols. Coding theory bounds tell us that the larger $q$
is, the smaller $g$ can be made, possibly using Algebraic-Geometry
codes [35].



3 Randomized (wet paper) syndrome coding 

Since embedding a message has a non-zero probability to fail, we propose 
to relax the constraints in the following way: 

Problem 5 (Randomized bounded syndrome coding problem for wet paper).
Let $C$ be an $[n, k, d]_q$ linear code, $H$ be a parity check matrix
of $C$, $r$ and $T$ be two integers, $x \in \mathbb{F}_q^n$,
$m \in \mathbb{F}_q^{n-k-r}$ be the message to embed, and
$\mathcal{I} \subset \{1, \dots, n\}$ be the set of locked components,
$\ell = |\mathcal{I}|$. Our randomized syndrome coding problem for wet
paper consists in finding $y \in \mathbb{F}_q^n$ and
$R \in \mathbb{F}_q^r$ such that (i) $yH^t = (m \| R)$, where $\|$
denotes the concatenation operator, (ii) $d(x, y) \le T$, and
(iii) $x_i = y_i$ for all $i \in \mathcal{I}$.

We thus randomize one fraction of the syndrome to increase the number
of solutions. This gives a degree of freedom which may be large enough
to solve the system. The traditional approach can then be applied to
find $y|_{\bar{\mathcal{I}}}$ and consequently $y$. Random symbols in
the syndrome were also used in the signature scheme of Courtois,
Finiasz and Sendrier [8]. While this reformulation makes it possible to
solve the bounded syndrome coding problem in the wet paper context
without failure, we obviously lose some efficiency compared to the
traditional approach.

We now estimate the loss in embedding efficiency for a given number
of locked components. Let $e$ denote the embedding efficiency of the
traditional approach, and $e'$ denote the efficiency of the randomized
one. We obtain a relative loss of:

$$\frac{e - e'}{e} = \frac{r}{n - k},$$

while being assured that any message of $n - k - r$ symbols can be
embedded, as long as $r < n - k$.

Optimizing the parameter $r$ is crucial, to ensure that our
reformulated problem always has a solution, while preserving the best
possible embedding efficiency. This is the goal of the next section.

4 Case of perfect linear codes 

We discuss in this Section a sufficient condition on the size r of random- 
ization, for our reformulated problem to always have a solution. 

4.1 General Statement 

The syndrome function associated with $H$, denoted $S_H$, is defined
by:

$$S_H : \mathbb{F}_q^n \to \mathbb{F}_q^{n-k}, \qquad x \mapsto xH^t.$$

This function $S_H$ is linear and surjective, and satisfies the
following well-known properties. Let $B(x, T)$ denote the Hamming ball
of radius $T$ centered on $x$.

Proposition 3. Let $C$ be an $[n, k, d]_q$-linear code, with covering
radius $\rho$, $H$ a parity check matrix of $C$, and $S_H$ the syndrome
function associated with $H$. For all $x \in \mathbb{F}_q^n$, the
function $S_H$ restricted to $B(x, \lfloor \frac{d-1}{2} \rfloor)$ is
one-to-one, and the function $S_H$ restricted to $B(x, \rho)$ is
surjective. When $C$ is perfect, the syndrome function restricted to
$B(x, \rho)$ is bijective.

Now, we give a sufficient condition on $r$ for Problem 5 to have a
solution.

Proposition 4. Given an $[n, k, d]$ perfect code with
$\rho = \lfloor \frac{d-1}{2} \rfloor$, if the inequality

$$q^{n-k} + 1 \le q^r + \sum_{i=0}^{\rho} (q-1)^i \binom{n-\ell}{i} \qquad (3)$$

is satisfied, then there exist a vector $y \in \mathbb{F}_q^n$ and a
random vector $R$ which are a solution of Problem 5. In this case,
Problem 5 always has a solution $y$.

Proof. Let $N_1$ (respectively $N_2$) be the number of different
syndromes generated by the subset of $\mathbb{F}_q^n$ satisfying (i) of
Problem 5 (respectively (ii) and (iii)). If

$$N_1 + N_2 > q^{n-k}, \qquad (4)$$

then the two sets of syndromes must intersect, so there exists $y$
which fulfills conditions (i), (ii), and (iii). The number of different
syndromes satisfying the first constraint, for all $R$, is $q^r$.
Keeping in mind that $\ell$ components are locked and that the syndrome
function restricted to $B(x, \rho)$ is bijective, we have

$$N_2 = \sum_{i=0}^{\rho} (q-1)^i \binom{n-\ell}{i}.$$

Combined with the sufficient condition (4), we obtain the result.
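
The condition of Proposition 4 is easy to evaluate numerically. The
Python sketch below, whose function names are ours, computes
$N_2 = \sum_{i=0}^{\rho} (q-1)^i \binom{n-\ell}{i}$ and the smallest
$r$ such that $q^{n-k} + 1 \le q^r + N_2$, using exact integer
arithmetic only.

```python
from math import comb

def ball_syndromes(q: int, n: int, rho: int, ell: int) -> int:
    """N_2: number of vectors within distance rho of x that respect ell locked positions."""
    return sum((q - 1) ** i * comb(n - ell, i) for i in range(rho + 1))

def min_random_symbols(q: int, n: int, k: int, rho: int, ell: int) -> int:
    """Smallest r >= 0 such that q^(n-k) + 1 <= q^r + N_2 (inequality (3))."""
    target = q ** (n - k) + 1 - ball_syndromes(q, n, rho, ell)
    r = 0
    while q ** r < target:
        r += 1
    return r
```

For example, `min_random_symbols(2, 23, 12, 3, 1)` and
`min_random_symbols(3, 11, 6, 2, 1)` give the sizes of the random part
needed by the binary and ternary Golay codes when a single position is
locked, and `min_random_symbols(2, 7, 4, 1, ell)` covers the binary
[7, 4] Hamming code.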

The next sections are devoted to the non-trivial perfect codes: the
Golay codes, and the ($q$-ary) Hamming codes.

4.2 Golay codes 

Binary Golay code We start by studying the case of the binary
$[23, 12, 7]_2$ Golay code, which is perfect, with $\rho = 3$. The
inequality of Proposition 4 gives

$$r \ge \log_2\Bigl(2049 - \sum_{i=0}^{3} \binom{23-\ell}{i}\Bigr). \qquad (5)$$


Ternary perfect Golay code The ternary Golay code has parameters
$[11, 6, 5]_3$. Using Proposition 4, we obtain:

$$r \ge \log_3(1 + 44\ell - 2\ell^2). \qquad (6)$$

Eqs. (5) and (6) do not say much by themselves. We have plotted the
results in Fig. 1, and we see that the number of available bits for
embedding degrades very fast with the number of locked positions.

Fig. 1. Size of the random part for the two Golay codes. The number of
remaining bits is plotted, in terms of the number of locked positions.



4.3 Hamming codes 

We study the infinite family of Hamming codes. We find r, analyze the 
found parameters, and study its asymptotic behavior. 



Computation of r Let $C$ be a $[(q^p - 1)/(q - 1), n - p, 3]_q$
Hamming code over $\mathbb{F}_q$, for some $p$. Its covering radius is
$\rho = 1$, and thus its embedding efficiency is $p$. We aim to
minimize $r$, the length of the random vector $R$. Since
$q^{n-k} = q^p$ and $(q^p - 1)/(q - 1) = n$, Proposition 4 gives:

$$r \ge \log_q(1 + (q-1)\ell). \qquad (7)$$



Analysis of parameters In order to find an extreme case, we maximize
the number of locked components $\ell$ while still keeping
$n - k - r \ge 1$. A direct computation gives:

$$p - 1 = \log_q((q-1)\ell + 1),$$
$$\ell = \frac{q^{p-1} - 1}{q - 1} \sim \frac{n}{q}.$$

Therefore, using Hamming codes, we can embed at least one information
symbol if no more than a fraction of $\frac{1}{q}$ of the components
are locked. This is of course best for $q = 2$. The minimum $r$ which
satisfies inequality (7) is
$r = \lceil \log_q((q-1)\ell + 1) \rceil$. In other words, for Hamming
codes, the minimum number of randomized symbols needed to guarantee
that the whole message can be embedded is logarithmic in the number of
locked components. Our randomized approach always solves Problem 5
successfully, while traditional syndrome coding (including wet paper)
exhibits a non-zero failure rate, when
$\frac{\ell}{n} < \frac{1}{q}$.
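
The guarantee can be checked exhaustively on a small case. The Python
sketch below is a brute-force illustration, not an efficient
implementation: it assumes binary Hamming codes with the parity check
matrix whose $j$-th column is the binary expansion of $j$, takes the
message as the first $p - r$ syndrome bits, and simply tries every
vector at distance at most 1 from $x$ that leaves the locked positions
untouched. The function names are ours.

```python
from itertools import combinations, product

def hamming_parity_check(p):
    """p x (2^p - 1) matrix whose j-th column is the binary expansion of j."""
    n = (1 << p) - 1
    return [[(j >> i) & 1 for j in range(1, n + 1)] for i in range(p)]

def syndrome(y, H):
    """Compute y H^t over F_2, as a tuple."""
    return tuple(sum(h * b for h, b in zip(row, y)) % 2 for row in H)

def randomized_embed(x, m, H, locked):
    """Return y (at most one dry bit flipped) whose syndrome starts with m, else None."""
    k = len(m)
    candidates = [None] + [j for j in range(len(x)) if j not in locked]
    for j in candidates:
        y = list(x)
        if j is not None:
            y[j] ^= 1
        if syndrome(y, H)[:k] == tuple(m):
            return y  # the remaining syndrome bits play the role of R
    return None

def always_succeeds(p, ell, r):
    """Check every cover x, every message m and every locked set of size ell."""
    H = hamming_parity_check(p)
    n = len(H[0])
    return all(
        randomized_embed(list(x), m, H, set(locked)) is not None
        for locked in combinations(range(n), ell)
        for x in product((0, 1), repeat=n)
        for m in product((0, 1), repeat=p - r)
    )
```

For the [7, 4] code with $\ell = 3$ locked bits, inequality (7) asks
for $r = \lceil \log_2 4 \rceil = 2$, and `always_succeeds(3, 3, 2)`
confirms that no configuration fails, whereas `always_succeeds(3, 3, 1)`
finds failing configurations.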

Asymptotic behavior Now we evaluate the loss in embedding efficiency.
For a given $\ell$, the relative loss of the embedding efficiency is
given by:

$$\frac{\lceil \log_q((q-1)\ell + 1) \rceil}{p}.$$

To conclude this section, we focus on the normalized loss in symbols
for the family of Hamming codes. We assume that the ratio of $\ell$,
the number of locked components, to $n$, the length of the cover-data,
stays constant, i.e. $\ell = \lambda n$, for a given
$\lambda \in [0, \frac{1}{q})$. Then the asymptotic relative loss is

$$\frac{\log_q((q-1)\ell + 1)}{p} \sim \frac{\log_q(n(q-1)\lambda)}{p} \sim 1 + \frac{\log_q \lambda}{p}.$$

This goes to 1 when $p$ goes to infinity, i.e. all the symbols of the
syndrome are consumed by the randomization. It makes sense, since
dealing with a given proportion $\lambda$ of arbitrarily locked symbols
in a long stego-data is much harder than dealing with several smaller
stego-data with the same proportion $\lambda$ of locked positions.
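
This asymptotic behavior can be observed numerically. In the Python
sketch below, whose function names are ours, the number of locked
components is taken as the integer part of $\lambda n$, which is an
assumption made for the illustration only.

```python
def relative_loss(q: int, p: int, ell: int) -> float:
    """ceil(log_q((q-1)*ell + 1)) / p, computed with exact integer arithmetic."""
    target = (q - 1) * ell + 1
    r = 0
    while q ** r < target:
        r += 1
    return r / p

def loss_at_constant_rate(q: int, p: int, lam: float) -> float:
    """Relative loss for the Hamming code of parameter p with about lam * n locked symbols."""
    n = (q ** p - 1) // (q - 1)
    return relative_loss(q, p, int(lam * n))
```

For $q = 2$ and $\lambda = 1/4$, the values obtained for $p = 4$, $8$
and $16$ follow $1 + \log_2(\lambda)/p = 1 - 2/p$ and slowly approach 1.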

5 Using ZZW construction to embed dynamic parameters 

In the approach given in the previous section, the sender and recipient
have to fix in advance the value of $r$. Indeed, the recipient has to
know which part of the syndrome is random. This is not very compliant
with the Wet Paper model, where the recipient does not know the
quantity of wet bits. We propose in this section a variant of ZZW's
scheme [39], which makes it possible to convey dynamically the value of
$r$, depending on the cover-data.

5.1 The scheme 

We consider that we are treating $n$ blocks of $2^p - 1$ bits,
$x_1, \dots, x_n$, for instance displayed as in Figure 2. Each block
$x_i$ is a binary vector of length $2^p - 1$, set as a column, and we
let $v = (v_1, \dots, v_n)$ be the binary vector whose $i$-th
coordinate $v_i$ is the parity bit of column $x_i$. We use the
(virtual) vector $v$ to convey extra information, while at the same
time the $x_i$ are used for syndrome coding.

Our scheme is threefold: syndrome coding on the $x_i$'s using the
parity check matrix $H_1$ of a first Hamming code, with our randomized
method, then (unbounded wet paper) syndrome embedding on the syndromes
$s_i$ of the $x_i$'s. This second syndrome embedding sees the $s_i$ as
$q$-ary symbols, and the matrix in use is the parity check matrix $H_q$
of a $q$-ary Reed-Solomon code. We call the $n$ first embeddings the
$H_1$-embeddings, and the second one the $H_q$-embedding. Finally, we
use $v$ to embed dynamic information: the number $r$ of random bits,
and $f$, the number of failures in the $H_1$-embeddings. We call this
last embedding the $H_2$-embedding, where $H_2$ is the parity check
matrix of a second, much shorter, binary Hamming code.

We assume that $r$ is bounded by design, say $r \le r_{\max}$. We shall
see, after a discussion on all the parameters, that this is one design
parameter of the scheme, together with $o$, which is the precision, in
bits, for describing real numbers in $[\frac{1}{2}, 1]$.

Embedding 

Inspect. Each column $x_1, \dots, x_n$ is inspected, to find the
number of dry bits in each. This makes it possible to determine the
size $r$ of the randomized part, which shall be the same for all
columns. This determines the columns $x_i$ where the $H_1$-embeddings
are feasible. Let $f$ be the number of $x_i$'s where the
$H_1$-embeddings fail.

Build the wet channel. For each of the $n - f$ columns $x_i$ where the
$H_1$-embedding is possible, there is a syndrome $s_i$ of $p$ bits,
where the last $r$ bits are random, thus wet, and the first $p - r$
bits are dry. We consider these blocks of $p - r$ bits as $q$-ary
symbols, with $q = 2^{p-r}$. Thus we have a $q$-ary wet channel with
$n - f$ dry $q$-ary symbols, and $f$ wet $q$-ary symbols.

Embed for the wet channel. Then, using a Reed-Solomon code over the
alphabet $\mathbb{F}_q$, we can embed $(n - f)$ $q$-ary symbols, using
an $n \times (n - f)$ $q$-ary parity check matrix $H_q$ of the code.
Note that the number of rows of this matrix is dynamic since $f$ is
dynamic.

Embed dynamic data. We have to embed the dynamic parameters $r$ and
$f$, which are unknown to the recipient, using ZZW's virtual vector
$v$. For this binary channel, the dry bits $v_i$ correspond to the
columns $x_i$ where the $H_1$-embedding has failed, and where there is
at least one dry bit in $x_i$. A second Hamming code is used, with
parity check matrix $H_2$, for this embedding.

Recovery 

$H_2$-extraction. First compute $v$, and using the parity check matrix
$H_2$ of the second Hamming code, extract $r$ and $f$.

$H_1$-extraction. Extract the syndromes of all the columns $x_i$ using
the parity check matrix $H_1$, and collect only the first $p - r$ bits
in each column, to build $q$-ary symbols.

$H_q$-extraction. Build the parity check matrix $H_q$ of the $q$-ary
$[n, f]_q$ Reed-Solomon code, with $q = 2^{p-r}$. Using this matrix,
get the $(n - f)$ $q$-ary information symbols, which are the actual
payload.
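
Part of the recovery can be sketched in Python: computing the virtual
vector $v$ and performing the $H_1$-extraction. This is a partial
illustration under assumptions of ours: $H_1$ is the Hamming parity
check matrix whose $j$-th column is the binary expansion of $j$, the
value of $r$ is supposed to be already known, and the $p - r$ kept bits
are packed into an integer with an arbitrary (hypothetical) bit order.
The $H_2$- and $H_q$-extractions are not covered.

```python
def hamming_parity_check(p):
    """p x (2^p - 1) matrix whose j-th column is the binary expansion of j."""
    n = (1 << p) - 1
    return [[(j >> i) & 1 for j in range(1, n + 1)] for i in range(p)]

def syndrome(y, H):
    """Compute y H^t over F_2."""
    return [sum(h * b for h, b in zip(row, y)) % 2 for row in H]

def column_parities(columns):
    """Virtual vector v: v_i is the parity bit of column x_i."""
    return [sum(col) % 2 for col in columns]

def h1_extract(columns, p, r):
    """Keep the first p - r syndrome bits of each column, packed as an integer symbol."""
    H1 = hamming_parity_check(p)
    symbols = []
    for col in columns:
        s = syndrome(col, H1)
        # Assumed packing: bit i of the symbol is the i-th kept syndrome bit.
        symbols.append(sum(bit << i for i, bit in enumerate(s[:p - r])))
    return symbols
```

The integers returned by `h1_extract` stand for the $q$-ary symbols,
with $q = 2^{p-r}$, on which the $H_q$-extraction would then operate.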

Fig. 2. A graphical view of our scheme inspired from ZZW. A syndrome
$s_i$ is considered wet for the $H_q$-embedding when the
$H_1$-embedding is not feasible. Then the corresponding bit $v_i$ in
the vector $v$ is dry for the $H_2$-embedding. Wet data is grey on the
Figure.



5.2 Analysis 



There are several constraints on the scheme.

First, for a Reed-Solomon code of length $n$ to exist over the alphabet
$\mathbb{F}_{2^{p-r}}$, we must have $n < 2^{p-r}$ for any $r$, i.e.
$n < 2^{p-r_{\max}}$. We fix $n = 2^{p-r_{\max}} - 1$, and let us
briefly denote $u = p - r_{\max}$.

Then the binary $[n = 2^u - 1, 2^u - u - 1]_2$ Hamming code, with
parity check matrix $H_2$, is used for embedding in the vector $v$,
with $f$ dry symbols. This is an unbounded wet channel. From
Proposition 2, we must have

$$f \ge 2^{u-1}, \qquad (8)$$

which implies that some columns $x_i$ may be artificially declared wet,
to satisfy Eq. (8). Third, we also must have

$$u = \lceil \log r_{\max} \rceil + \lceil \log f_{\max} \rceil, \qquad (9)$$

to be able to embed $r$ and $f$. Since $f \le 2^u - 1$, we have
$\lceil \log f_{\max} \rceil = u$. Eq. (9) becomes
$u = \lceil \log r_{\max} \rceil + u$, which is clearly not feasible.
To remedy this, instead of embedding $f$, we embed its relative value
$f_u \in [\frac{1}{2}, 1]$, up to a fixed precision, say $o$ bits, with
$o$ small. Then Eq. (9) is replaced by

$$u = \lceil \log r_{\max} \rceil + o, \qquad (10)$$
$$p = r_{\max} + \lceil \log r_{\max} \rceil + o, \qquad (11)$$

which is a condition easy to fulfill. It is also possible, by design,
to use the all-one value of $f_u$ as an out-of-range value to declare
an embedding failure. The scheme is locally adaptive to the media: for
instance, in a given image, $r$ and $f$ may take different values for
different areas of the image.

In conclusion, the number of bits that we can embed using that scheme
is bounded by $(n - f)(p - r) < 2^{u-1}(p - r)$, with dynamic $r$ and
$f$.
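
The parameter relations of this analysis can be gathered in a short
Python sketch. It assumes base-2 logarithms rounded up in Eqs. (10)
and (11), and the function names are ours.

```python
def scheme_parameters(r_max: int, o: int):
    """Return (u, p, n) from Eqs. (10)-(11) and n = 2^u - 1, for r_max >= 1."""
    bits_r = (r_max - 1).bit_length()  # ceil(log2(r_max)) in exact arithmetic
    u = bits_r + o                     # Eq. (10)
    p = r_max + u                      # Eq. (11), so that u = p - r_max
    n = 2 ** u - 1
    return u, p, n

def payload_bits(n: int, f: int, p: int, r: int) -> int:
    """Number of embedded bits: (n - f) symbols of p - r bits each."""
    return (n - f) * (p - r)
```

For instance, with $r_{\max} = 4$ and $o = 3$, one gets $u = 5$,
$p = 9$ and $n = 31$ columns of $2^9 - 1$ bits. With the smallest
number of failures allowed by Eq. (8), $f = 16$, and with $r = 4$, the
payload is $15 \times 5 = 75$ bits.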

6 Conclusion 

In this paper, we addressed the "worst-case" scenario, where the sender
cannot accept embedding to fail, and does not want to relax the
management of locked components of her cover-data. As traditional (wet)
syndrome coding may fail, and as the failure probability increases
exponentially with the message length, we proposed here a different
approach, which never fails. Our solution is based on the randomization
of a part of the syndrome, the other part still carrying symbols of the
message to transmit. While our method suffers from a loss of embedding
efficiency, we showed that this loss remains acceptable for perfect
codes. Moreover, we showed how the size of the random part of the
syndrome, which is dynamically estimated during embedding, may be
transmitted to the recipient without any additional communication.